Overview

Dataset statistics

Number of variables: 11
Number of observations: 20433
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 MiB
Average record size in memory: 88.0 B

Variable types

Numeric: 10
Categorical: 1

Alerts

longitude is highly correlated with latitude (High correlation)
latitude is highly correlated with longitude (High correlation)
total_rooms is highly correlated with total_bedrooms and 2 other fields (High correlation)
total_bedrooms is highly correlated with total_rooms and 2 other fields (High correlation)
population is highly correlated with total_rooms and 2 other fields (High correlation)
households is highly correlated with total_rooms and 2 other fields (High correlation)
median_income is highly correlated with median_house_value (High correlation)
median_house_value is highly correlated with median_income (High correlation)
df_index is highly correlated with longitude and 3 other fields (High correlation)
longitude is highly correlated with df_index and 3 other fields (High correlation)
latitude is highly correlated with df_index and 3 other fields (High correlation)
median_house_value is highly correlated with df_index and 4 other fields (High correlation)
ocean_proximity is highly correlated with df_index and 3 other fields (High correlation)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)

Reproduction

Analysis started: 2022-03-21 14:45:02.677471
Analysis finished: 2022-03-21 14:45:42.889439
Duration: 40.21 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
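
The reported duration can be cross-checked against the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2022-03-21 14:45:02.677471")
finished = datetime.fromisoformat("2022-03-21 14:45:42.889439")

# Subtracting two datetimes gives a timedelta; convert it to seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```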

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 20433
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10316.17609
Minimum: 0
Maximum: 20639
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1027.6
Q1: 5162
Median: 10319
Q3: 15473
95-th percentile: 19601.4
Maximum: 20639
Range: 20639
Interquartile range (IQR): 10311

Descriptive statistics

Standard deviation: 5956.699278
Coefficient of variation (CV): 0.5774134938
Kurtosis: -1.198669631
Mean: 10316.17609
Median Absolute Deviation (MAD): 5156
Skewness: -0.0007732714102
Sum: 210790426
Variance: 35482266.29
Monotonicity: Strictly increasing
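
Several of the df_index figures are derived from others in the same block, so they can be cross-checked by hand. The sketch below recomputes the range, the IQR and the coefficient of variation from the reported quantiles, mean and standard deviation. The variable names are illustrative, and the definitions used (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean) are the standard ones, assumed here to match what the report computes.

```python
import math

# Values copied from the df_index statistics.
minimum, maximum = 0, 20639
q1, q3 = 5162, 15473
mean = 10316.17609
std = 5956.699278
variance = 35482266.29

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation

print(value_range, iqr, round(cv, 6))

# The variance should equal the squared standard deviation, up to rounding.
print(math.isclose(std ** 2, variance, rel_tol=1e-6))
```

The same three checks apply to every numeric variable in the report.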
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     1       < 0.1%
13749                 1       < 0.1%
13756                 1       < 0.1%
13755                 1       < 0.1%
13754                 1       < 0.1%
13753                 1       < 0.1%
13752                 1       < 0.1%
13751                 1       < 0.1%
13750                 1       < 0.1%
13748                 1       < 0.1%
Other values (20423)  20423   > 99.9%

Value                 Count   Frequency (%)
0                     1       < 0.1%
1                     1       < 0.1%
2                     1       < 0.1%
3                     1       < 0.1%
4                     1       < 0.1%
5                     1       < 0.1%
6                     1       < 0.1%
7                     1       < 0.1%
8                     1       < 0.1%
9                     1       < 0.1%

Value                 Count   Frequency (%)
20639                 1       < 0.1%
20638                 1       < 0.1%
20637                 1       < 0.1%
20636                 1       < 0.1%
20635                 1       < 0.1%
20634                 1       < 0.1%
20633                 1       < 0.1%
20632                 1       < 0.1%
20631                 1       < 0.1%
20630                 1       < 0.1%

longitude
Real number (ℝ)

HIGH CORRELATION

Distinct: 844
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -119.5706886
Minimum: -124.35
Maximum: -114.31
Zeros: 0
Zeros (%): 0.0%
Negative: 20433
Negative (%): 100.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: -124.35
5-th percentile: -122.47
Q1: -121.8
Median: -118.49
Q3: -118.01
95-th percentile: -117.08
Maximum: -114.31
Range: 10.04
Interquartile range (IQR): 3.79

Descriptive statistics

Standard deviation: 2.003577891
Coefficient of variation (CV): -0.01675643014
Kurtosis: -1.332548154
Mean: -119.5706886
Median Absolute Deviation (MAD): 1.29
Skewness: -0.2961409006
Sum: -2443187.88
Variance: 4.014324364
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
-118.31               159     0.8%
-118.3                157     0.8%
-118.29               146     0.7%
-118.27               141     0.7%
-118.32               141     0.7%
-118.28               139     0.7%
-118.35               138     0.7%
-118.36               135     0.7%
-118.19               134     0.7%
-118.25               126     0.6%
Other values (834)    19017   93.1%

Value                 Count   Frequency (%)
-124.35               1       < 0.1%
-124.3                2       < 0.1%
-124.27               1       < 0.1%
-124.26               1       < 0.1%
-124.25               1       < 0.1%
-124.23               3       < 0.1%
-124.22               1       < 0.1%
-124.21               3       < 0.1%
-124.19               4       < 0.1%
-124.18               6       < 0.1%

Value                 Count   Frequency (%)
-114.31               1       < 0.1%
-114.47               1       < 0.1%
-114.49               1       < 0.1%
-114.55               1       < 0.1%
-114.56               1       < 0.1%
-114.57               3       < 0.1%
-114.58               2       < 0.1%
-114.59               1       < 0.1%
-114.6                3       < 0.1%
-114.61               3       < 0.1%

latitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 861
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.63322126
Minimum: 32.54
Maximum: 41.95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 32.54
5-th percentile: 32.82
Q1: 33.93
Median: 34.26
Q3: 37.72
95-th percentile: 38.96
Maximum: 41.95
Range: 9.41
Interquartile range (IQR): 3.79

Descriptive statistics

Standard deviation: 2.136347666
Coefficient of variation (CV): 0.05995381812
Kurtosis: -1.119522552
Mean: 35.63322126
Median Absolute Deviation (MAD): 1.23
Skewness: 0.464934277
Sum: 728093.61
Variance: 4.563981352
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
34.06                 241     1.2%
34.08                 232     1.1%
34.05                 229     1.1%
34.07                 227     1.1%
34.04                 215     1.1%
34.09                 209     1.0%
34.02                 207     1.0%
34.1                  201     1.0%
34.03                 189     0.9%
33.93                 181     0.9%
Other values (851)    18302   89.6%

Value                 Count   Frequency (%)
32.54                 1       < 0.1%
32.55                 3       < 0.1%
32.56                 10      < 0.1%
32.57                 18      0.1%
32.58                 26      0.1%
32.59                 11      0.1%
32.6                  9       < 0.1%
32.61                 14      0.1%
32.62                 13      0.1%
32.63                 18      0.1%

Value                 Count   Frequency (%)
41.95                 2       < 0.1%
41.92                 1       < 0.1%
41.88                 1       < 0.1%
41.86                 3       < 0.1%
41.84                 1       < 0.1%
41.82                 1       < 0.1%
41.81                 2       < 0.1%
41.8                  3       < 0.1%
41.79                 1       < 0.1%
41.78                 3       < 0.1%

housing_median_age
Real number (ℝ≥0)

Distinct: 52
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.63309353
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 18
Median: 29
Q3: 37
95-th percentile: 52
Maximum: 52
Range: 51
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 12.5918052
Coefficient of variation (CV): 0.439764051
Kurtosis: -0.8010133431
Mean: 28.63309353
Median Absolute Deviation (MAD): 10
Skewness: 0.06160542583
Sum: 585060
Variance: 158.5535582
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
52                    1265    6.2%
36                    856     4.2%
35                    818     4.0%
16                    762     3.7%
17                    694     3.4%
34                    682     3.3%
26                    611     3.0%
33                    609     3.0%
25                    562     2.8%
32                    560     2.7%
Other values (42)     13014   63.7%

Value                 Count   Frequency (%)
1                     4       < 0.1%
2                     58      0.3%
3                     62      0.3%
4                     190     0.9%
5                     242     1.2%
6                     157     0.8%
7                     173     0.8%
8                     203     1.0%
9                     204     1.0%
10                    263     1.3%

Value                 Count   Frequency (%)
52                    1265    6.2%
51                    47      0.2%
50                    135     0.7%
49                    133     0.7%
48                    174     0.9%
47                    195     1.0%
46                    245     1.2%
45                    286     1.4%
44                    353     1.7%
43                    351     1.7%

total_rooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 5911
Distinct (%): 28.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2636.504233
Minimum: 2
Maximum: 39320
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 622
Q1: 1450
Median: 2127
Q3: 3143
95-th percentile: 6217
Maximum: 39320
Range: 39318
Interquartile range (IQR): 1693

Descriptive statistics

Standard deviation: 2185.269567
Coefficient of variation (CV): 0.8288511505
Kurtosis: 32.7138594
Mean: 2636.504233
Median Absolute Deviation (MAD): 795
Skewness: 4.158816423
Sum: 53871691
Variance: 4775403.08
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1527                  18      0.1%
1582                  17      0.1%
1613                  17      0.1%
2127                  16      0.1%
1471                  15      0.1%
2053                  15      0.1%
1607                  15      0.1%
1722                  15      0.1%
1717                  15      0.1%
1703                  15      0.1%
Other values (5901)   20275   99.2%

Value                 Count   Frequency (%)
2                     1       < 0.1%
6                     1       < 0.1%
8                     1       < 0.1%
11                    1       < 0.1%
12                    1       < 0.1%
15                    2       < 0.1%
16                    1       < 0.1%
18                    4       < 0.1%
19                    2       < 0.1%
20                    2       < 0.1%

Value                 Count   Frequency (%)
39320                 1       < 0.1%
37937                 1       < 0.1%
32627                 1       < 0.1%
32054                 1       < 0.1%
30450                 1       < 0.1%
30405                 1       < 0.1%
30401                 1       < 0.1%
28258                 1       < 0.1%
27870                 1       < 0.1%
27700                 1       < 0.1%

total_bedrooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1923
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 537.8705525
Minimum: 1
Maximum: 6445
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 137
Q1: 296
Median: 435
Q3: 647
95-th percentile: 1275.4
Maximum: 6445
Range: 6444
Interquartile range (IQR): 351

Descriptive statistics

Standard deviation: 421.3850701
Coefficient of variation (CV): 0.7834321252
Kurtosis: 21.98557506
Mean: 537.8705525
Median Absolute Deviation (MAD): 162
Skewness: 3.459546332
Sum: 10990309
Variance: 177565.3773
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
280                   55      0.3%
331                   51      0.2%
345                   50      0.2%
343                   49      0.2%
393                   49      0.2%
328                   48      0.2%
348                   48      0.2%
394                   48      0.2%
272                   47      0.2%
309                   47      0.2%
Other values (1913)   19941   97.6%

Value                 Count   Frequency (%)
1                     1       < 0.1%
2                     2       < 0.1%
3                     5       < 0.1%
4                     7       < 0.1%
5                     6       < 0.1%
6                     5       < 0.1%
7                     6       < 0.1%
8                     8       < 0.1%
9                     7       < 0.1%
10                    8       < 0.1%

Value                 Count   Frequency (%)
6445                  1       < 0.1%
6210                  1       < 0.1%
5471                  1       < 0.1%
5419                  1       < 0.1%
5290                  1       < 0.1%
5033                  1       < 0.1%
5027                  1       < 0.1%
4957                  1       < 0.1%
4952                  1       < 0.1%
4819                  1       < 0.1%

population
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3879
Distinct (%): 19.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1424.946949
Minimum: 3
Maximum: 35682
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 3
5-th percentile: 348
Q1: 787
Median: 1166
Q3: 1722
95-th percentile: 3284.4
Maximum: 35682
Range: 35679
Interquartile range (IQR): 935

Descriptive statistics

Standard deviation: 1133.20849
Coefficient of variation (CV): 0.7952636348
Kurtosis: 74.06088815
Mean: 1424.946949
Median Absolute Deviation (MAD): 439
Skewness: 4.960016542
Sum: 29115941
Variance: 1284161.481
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
891                   25      0.1%
1052                  24      0.1%
850                   24      0.1%
1227                  24      0.1%
761                   24      0.1%
825                   22      0.1%
782                   22      0.1%
1005                  22      0.1%
872                   21      0.1%
753                   21      0.1%
Other values (3869)   20204   98.9%

Value                 Count   Frequency (%)
3                     1       < 0.1%
5                     1       < 0.1%
6                     1       < 0.1%
8                     4       < 0.1%
9                     2       < 0.1%
11                    1       < 0.1%
13                    4       < 0.1%
14                    3       < 0.1%
15                    2       < 0.1%
17                    2       < 0.1%

Value                 Count   Frequency (%)
35682                 1       < 0.1%
28566                 1       < 0.1%
16305                 1       < 0.1%
16122                 1       < 0.1%
15507                 1       < 0.1%
15037                 1       < 0.1%
13251                 1       < 0.1%
12873                 1       < 0.1%
12427                 1       < 0.1%
12203                 1       < 0.1%

households
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1809
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 499.4334655
Minimum: 1
Maximum: 6082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 125
Q1: 280
Median: 409
Q3: 604
95-th percentile: 1159
Maximum: 6082
Range: 6081
Interquartile range (IQR): 324

Descriptive statistics

Standard deviation: 382.2992259
Coefficient of variation (CV): 0.7654657774
Kurtosis: 22.094083
Mean: 499.4334655
Median Absolute Deviation (MAD): 151
Skewness: 3.413850191
Sum: 10204924
Variance: 146152.6981
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
306                   57      0.3%
335                   56      0.3%
282                   55      0.3%
386                   55      0.3%
429                   54      0.3%
284                   51      0.2%
375                   51      0.2%
297                   51      0.2%
278                   50      0.2%
380                   50      0.2%
Other values (1799)   19903   97.4%

Value                 Count   Frequency (%)
1                     1       < 0.1%
2                     3       < 0.1%
3                     4       < 0.1%
4                     4       < 0.1%
5                     7       < 0.1%
6                     5       < 0.1%
7                     10      < 0.1%
8                     8       < 0.1%
9                     9       < 0.1%
10                    7       < 0.1%

Value                 Count   Frequency (%)
6082                  1       < 0.1%
5358                  1       < 0.1%
5189                  1       < 0.1%
5050                  1       < 0.1%
4930                  1       < 0.1%
4855                  1       < 0.1%
4769                  1       < 0.1%
4616                  1       < 0.1%
4490                  1       < 0.1%
4372                  1       < 0.1%

median_income
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12825
Distinct (%): 62.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.871161601
Minimum: 0.4999
Maximum: 15.0001
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 0.4999
5-th percentile: 1.60066
Q1: 2.5637
Median: 3.5365
Q3: 4.744
95-th percentile: 7.30034
Maximum: 15.0001
Range: 14.5002
Interquartile range (IQR): 2.1803

Descriptive statistics

Standard deviation: 1.899291249
Coefficient of variation (CV): 0.4906256687
Kurtosis: 4.943141125
Mean: 3.871161601
Median Absolute Deviation (MAD): 1.0649
Skewness: 1.644556916
Sum: 79099.445
Variance: 3.60730725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
3.125                 49      0.2%
15.0001               48      0.2%
2.875                 46      0.2%
4.125                 44      0.2%
2.625                 44      0.2%
3.875                 41      0.2%
3.375                 38      0.2%
4                     37      0.2%
3                     37      0.2%
3.625                 36      0.2%
Other values (12815)  20013   97.9%

Value                 Count   Frequency (%)
0.4999                12      0.1%
0.536                 10      < 0.1%
0.5495                1       < 0.1%
0.6433                1       < 0.1%
0.6775                1       < 0.1%
0.6825                1       < 0.1%
0.6831                1       < 0.1%
0.696                 1       < 0.1%
0.6991                1       < 0.1%
0.7007                1       < 0.1%

Value                 Count   Frequency (%)
15.0001               48      0.2%
15                    2       < 0.1%
14.9009               1       < 0.1%
14.5833               1       < 0.1%
14.4219               1       < 0.1%
14.4113               1       < 0.1%
14.2959               1       < 0.1%
14.2867               1       < 0.1%
13.947                1       < 0.1%
13.8556               1       < 0.1%

median_house_value
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3833
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 206864.4132
Minimum: 14999
Maximum: 500001
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 159.8 KiB

Quantile statistics

Minimum: 14999
5-th percentile: 66260
Q1: 119500
Median: 179700
Q3: 264700
95-th percentile: 490560
Maximum: 500001
Range: 485002
Interquartile range (IQR): 145200

Descriptive statistics

Standard deviation: 115435.6671
Coefficient of variation (CV): 0.5580257394
Kurtosis: 0.3280374703
Mean: 206864.4132
Median Absolute Deviation (MAD): 68400
Skewness: 0.9782898909
Sum: 4226860554
Variance: 1.332539324 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
500001                958     4.7%
137500                119     0.6%
162500                116     0.6%
112500                103     0.5%
187500                92      0.5%
225000                91      0.4%
350000                79      0.4%
87500                 77      0.4%
275000                65      0.3%
150000                64      0.3%
Other values (3823)   18669   91.4%

Value                 Count   Frequency (%)
14999                 4       < 0.1%
17500                 1       < 0.1%
22500                 4       < 0.1%
25000                 1       < 0.1%
26600                 1       < 0.1%
26900                 1       < 0.1%
27500                 1       < 0.1%
28300                 1       < 0.1%
30000                 2       < 0.1%
32500                 4       < 0.1%

Value                 Count   Frequency (%)
500001                958     4.7%
500000                27      0.1%
499100                1       < 0.1%
499000                1       < 0.1%
498800                1       < 0.1%
498700                1       < 0.1%
498600                1       < 0.1%
498400                1       < 0.1%
497600                1       < 0.1%
497400                1       < 0.1%

ocean_proximity
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 159.8 KiB

<1H OCEAN: 9034
INLAND: 6496
NEAR OCEAN: 2628
NEAR BAY: 2270
ISLAND: 5

Length

Max length: 10
Median length: 9
Mean length: 8.063035286
Min length: 6
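
The length statistics follow directly from the category counts reported for ocean_proximity. The sketch below recomputes them with the standard library; it assumes "length" means the number of characters in each category label, which is consistent with the figures shown, and the variable names are illustrative.

```python
from statistics import median

# Category counts copied from the ocean_proximity summary.
category_counts = {
    "<1H OCEAN": 9034,
    "INLAND": 6496,
    "NEAR OCEAN": 2628,
    "NEAR BAY": 2270,
    "ISLAND": 5,
}

n = sum(category_counts.values())

# Weighted mean of label lengths: each label counts once per observation.
mean_length = sum(len(label) * k for label, k in category_counts.items()) / n

# Expand to one length per observation to take the median.
lengths = [len(label) for label, k in category_counts.items() for _ in range(k)]

print(max(lengths), median(lengths), round(mean_length, 6), min(lengths))
```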

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NEAR BAY
2nd row: NEAR BAY
3rd row: NEAR BAY
4th row: NEAR BAY
5th row: NEAR BAY

Common Values

Value                 Count   Frequency (%)
<1H OCEAN             9034    44.2%
INLAND                6496    31.8%
NEAR OCEAN            2628    12.9%
NEAR BAY              2270    11.1%
ISLAND                5       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value                 Count   Frequency (%)
ocean                 11662   33.9%
1h                    9034    26.3%
inland                6496    18.9%
near                  4898    14.3%
bay                   2270    6.6%
island                5       < 0.1%
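
The word counts in this table can be reproduced from the category counts: "ocean" appears in both "<1H OCEAN" and "NEAR OCEAN", so its count is 9034 + 2628. The following sketch does this with the standard library. The tokenization rule (lowercase, split on anything that is not a letter or digit) is an assumption that happens to reproduce the table; it is not taken from the report itself.

```python
import re
from collections import Counter

# Category counts copied from the Common Values table.
category_counts = {
    "<1H OCEAN": 9034,
    "INLAND": 6496,
    "NEAR OCEAN": 2628,
    "NEAR BAY": 2270,
    "ISLAND": 5,
}

word_counts = Counter()
for category, count in category_counts.items():
    # Assumed tokenization: lowercase alphanumeric runs, so "<1H OCEAN" -> "1h", "ocean".
    for word in re.findall(r"[a-z0-9]+", category.lower()):
        word_counts[word] += count

total = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:<8}{count:>6}  {100 * count / total:.1f}%")
```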

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions

[Figures: interaction plots for pairs of variables (100 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
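
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The helper names (`average_ranks`, `pearson`, `spearman`) and the choice of average ranks for ties are assumptions made for illustration; the report's own implementation may differ in detail.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic in x, but not linear
print(spearman(x, y), pearson(x, y))
```

Because y increases whenever x does, the rank-based coefficient reaches its maximum even though the relationship is not a straight line, while Pearson's r on the raw values stays below 1.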

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
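
A minimal sketch of this calculation follows, using the median_income and median_house_value values from the first five sample rows of this report. The function name `pearson` is illustrative, and five rows are far too few to say anything about the full dataset; the point is only to show the formula and the location and scale invariance.

```python
from statistics import mean

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors in the covariance and standard deviations cancel.
    return cov / (sx * sy)

# First five sample rows: median_income and median_house_value.
income = [8.3252, 8.3014, 7.2574, 5.6431, 3.8462]
value = [452600.0, 358500.0, 352100.0, 341300.0, 342200.0]

r = pearson(income, value)

# Shifting and rescaling one variable (by a positive factor) leaves r unchanged.
r_rescaled = pearson(income, [v / 1000 + 3 for v in value])
print(r, r_rescaled)
```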

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
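
The pair-counting definition above can be written out directly. This sketch follows the description literally (the variant usually called tau-a): tied pairs count as neither concordant nor discordant, and the denominator is the total number of pairs. The function name `kendall_tau` is illustrative, and library implementations often apply a tie correction that this version omits.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
print(kendall_tau(x, y))
```

Comparing every pair makes this quadratic in the number of observations, which is fine for a small illustration but slow for a dataset of this size.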

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
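
Both displays can be approximated in text form. The sketch below uses a few hypothetical rows with gaps marked as `None`; the profiled dataset itself reports zero missing cells, so the gaps here are invented purely to show what the two views convey. The function name `present_counts` is also illustrative.

```python
# Hypothetical rows; None marks a missing cell (the real dataset has none).
rows = [
    {"total_rooms": 880.0, "total_bedrooms": 129.0, "population": 322.0},
    {"total_rooms": 7099.0, "total_bedrooms": None, "population": 2401.0},
    {"total_rooms": 1467.0, "total_bedrooms": 190.0, "population": None},
    {"total_rooms": 1274.0, "total_bedrooms": None, "population": 558.0},
]

def present_counts(rows):
    """Count non-missing values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is not None)
    return counts

counts = present_counts(rows)

# Nullity by column: one '#' per present value, one '.' per missing value.
for column, present in counts.items():
    print(f"{column:<16}{'#' * present}{'.' * (len(rows) - present)}")

# Nullity matrix: one line per row, one mark per column.
for row in rows:
    print("".join("#" if value is not None else "." for value in row.values()))
```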

Sample

First rows

       df_index  longitude  latitude  housing_median_age  total_rooms  total_bedrooms  population  households  median_income  median_house_value  ocean_proximity
0      0         -122.23    37.88     41.0                880.0        129.0           322.0       126.0       8.3252         452600.0            NEAR BAY
1      1         -122.22    37.86     21.0                7099.0       1106.0          2401.0      1138.0      8.3014         358500.0            NEAR BAY
2      2         -122.24    37.85     52.0                1467.0       190.0           496.0       177.0       7.2574         352100.0            NEAR BAY
3      3         -122.25    37.85     52.0                1274.0       235.0           558.0       219.0       5.6431         341300.0            NEAR BAY
4      4         -122.25    37.85     52.0                1627.0       280.0           565.0       259.0       3.8462         342200.0            NEAR BAY
5      5         -122.25    37.85     52.0                919.0        213.0           413.0       193.0       4.0368         269700.0            NEAR BAY
6      6         -122.25    37.84     52.0                2535.0       489.0           1094.0      514.0       3.6591         299200.0            NEAR BAY
7      7         -122.25    37.84     52.0                3104.0       687.0           1157.0      647.0       3.1200         241400.0            NEAR BAY
8      8         -122.26    37.84     42.0                2555.0       665.0           1206.0      595.0       2.0804         226700.0            NEAR BAY
9      9         -122.25    37.84     52.0                3549.0       707.0           1551.0      714.0       3.6912         261100.0            NEAR BAY

Last rows

       df_index  longitude  latitude  housing_median_age  total_rooms  total_bedrooms  population  households  median_income  median_house_value  ocean_proximity
20423  20630     -121.32    39.29     11.0                2640.0       505.0           1257.0      445.0       3.5673         112000.0            INLAND
20424  20631     -121.40    39.33     15.0                2655.0       493.0           1200.0      432.0       3.5179         107200.0            INLAND
20425  20632     -121.45    39.26     15.0                2319.0       416.0           1047.0      385.0       3.1250         115600.0            INLAND
20426  20633     -121.53    39.19     27.0                2080.0       412.0           1082.0      382.0       2.5495         98300.0             INLAND
20427  20634     -121.56    39.27     28.0                2332.0       395.0           1041.0      344.0       3.7125         116800.0            INLAND
20428  20635     -121.09    39.48     25.0                1665.0       374.0           845.0       330.0       1.5603         78100.0             INLAND
20429  20636     -121.21    39.49     18.0                697.0        150.0           356.0       114.0       2.5568         77100.0             INLAND
20430  20637     -121.22    39.43     17.0                2254.0       485.0           1007.0      433.0       1.7000         92300.0             INLAND
20431  20638     -121.32    39.43     18.0                1860.0       409.0           741.0       349.0       1.8672         84700.0             INLAND
20432  20639     -121.24    39.37     16.0                2785.0       616.0           1387.0      530.0       2.3886         89400.0             INLAND